
Support Anthropic API /v1/messages Endpoint #22627


Open · wants to merge 4 commits into base: main

Conversation


@LiuLi1998 LiuLi1998 commented Aug 11, 2025

This PR adds support for the Anthropic /v1/messages REST API endpoint to the vLLM FastAPI server.

  • Support the /v1/messages API
  • Compatible with all existing tool call parsers in the OpenAI API
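For reference, a request to the new endpoint would follow the Anthropic Messages format. A minimal sketch of the JSON body (the model name and server address are placeholders, not values from this PR):

```python
import json

# Hypothetical request body for POST /v1/messages; field names follow the
# Anthropic Messages API. The model name here is a placeholder.
payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "max_tokens": 256,
    "messages": [
        {"role": "user", "content": "Hello, world"},
    ],
}

# The server would receive this as the JSON body of the request, e.g.:
#   curl http://localhost:8000/v1/messages \
#     -H "content-type: application/json" -d "$(cat payload.json)"
print(json.dumps(payload, indent=2))
```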

@LiuLi1998 LiuLi1998 requested a review from aarnphm as a code owner August 11, 2025 08:20
@mergify mergify bot added the frontend label Aug 11, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for the Anthropic Messages API by adding a new API server, protocol definitions, and a serving layer for format conversion. The implementation is based on the existing OpenAI-compatible server. My review has identified several critical and high-severity issues, including a potential NoneType access error, incorrect Pydantic model usage that could lead to validation errors, a risk of generating duplicate tool call IDs, and another case of incorrect attribute access on a Pydantic model that would cause a runtime error. I have provided specific code suggestions to address these issues and ensure the stability and correctness of the new endpoint.

Comment on lines +105 to +106

```python
return messages(raw_request).create_error_response(
    message="The model does not support Chat Completions API")
```

critical

When handler is None, messages(raw_request) is also None. Calling create_error_response on a None object will raise an AttributeError, causing an unhandled exception and a 500 server error. You should construct an ErrorResponse directly to ensure a proper error is returned. You will need to import ErrorResponse from vllm.entrypoints.openai.protocol and HTTPStatus from http.

Suggested change

```diff
-return messages(raw_request).create_error_response(
-    message="The model does not support Chat Completions API")
+return ErrorResponse(message="The model does not support Chat Completions API",
+                     type="model_not_found",
+                     code=HTTPStatus.NOT_FOUND.value)
```

```python
req.tool_choice = ChatCompletionNamedToolChoiceParam.model_validate({
    "type": "function",
    "function": {
        "name": anthropic_request.tool_choice.get("name")
    }
})
```

critical

anthropic_request.tool_choice is a Pydantic model instance, not a dictionary. Accessing its attributes should be done with dot notation (e.g., .name). Using .get("name") will result in an AttributeError at runtime.

Suggested change

```diff
-        "name": anthropic_request.tool_choice.get("name")
+        "name": anthropic_request.tool_choice.name
```


```python
class AnthropicMessagesResponse(BaseModel):
    """Anthropic Messages API response"""
    id: str
```

high

The id field is defined as a required field. The model_post_init method, which attempts to set a default value, is called after Pydantic's validation. If id is not provided during initialization, a ValidationError will be raised before model_post_init can execute. To correctly provide a default value for an optional field, you should use default_factory in the field definition and remove the model_post_init method.

Suggested change

```diff
-    id: str
+    id: str = Field(default_factory=lambda: f"msg_{int(time.time() * 1000)}")
```
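A minimal sketch of the suggested fix, showing that `default_factory` supplies the id at validation time so omitting it no longer raises a `ValidationError`:

```python
import time

from pydantic import BaseModel, Field

class AnthropicMessagesResponse(BaseModel):
    # The default is computed per instance when id is not supplied,
    # before validation would otherwise fail on the missing field.
    id: str = Field(default_factory=lambda: f"msg_{int(time.time() * 1000)}")

resp = AnthropicMessagesResponse()            # no id supplied: default kicks in
print(resp.id)                                # e.g. msg_1723...

explicit = AnthropicMessagesResponse(id="msg_custom")
print(explicit.id)
```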

```python
elif block.type == "tool_use":
    # Convert tool use to function call format
    tool_call = {
        "id": block.id or f"call_{int(time.time())}",
```

high

Using int(time.time()) to generate tool call IDs is not safe as it can produce duplicate IDs for tool calls created in the same second. This can lead to incorrect behavior when matching tool calls to their results. It's better to use a UUID-based approach for uniqueness. You can use random_tool_call_id from vllm.entrypoints.chat_utils for this, which needs to be imported.

Suggested change

```diff
-        "id": block.id or f"call_{int(time.time())}",
+        "id": block.id or random_tool_call_id(),
```

Comment on lines +239 to +244

```python
content: List[AnthropicContentBlock] = [
    AnthropicContentBlock(
        type="text",
        text=generator.choices[0].message.content
    )
]
```

high

A text AnthropicContentBlock is created even if generator.choices[0].message.content is None. The Anthropic API requires the text field on text blocks, so when the response is serialized with exclude_none=True the text key is dropped and the block becomes invalid. You should only create the text content block when content is available.

Suggested change

```diff
-content: List[AnthropicContentBlock] = [
-    AnthropicContentBlock(
-        type="text",
-        text=generator.choices[0].message.content
-    )
-]
+content: List[AnthropicContentBlock] = []
+if generator.choices[0].message.content:
+    content.append(
+        AnthropicContentBlock(
+            type="text",
+            text=generator.choices[0].message.content))
```
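The failure mode is easy to demonstrate with a stand-in content block model (field names are illustrative):

```python
from typing import Optional

from pydantic import BaseModel

# Stand-in for the response content block.
class AnthropicContentBlock(BaseModel):
    type: str
    text: Optional[str] = None

block = AnthropicContentBlock(type="text", text=None)

# With exclude_none=True the "text" key disappears entirely, producing a
# text block without a text field.
print(block.model_dump(exclude_none=True))   # {'type': 'text'}

# Guarding on the content avoids emitting the block at all.
content = []
message_text = None  # stands in for generator.choices[0].message.content
if message_text:
    content.append(AnthropicContentBlock(type="text", text=message_text))
print(content)                               # []
```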


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they run only the fastcheck CI, a small and essential subset of tests intended to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@mgoin
Member

mgoin commented Aug 11, 2025

Exciting! Make sure to add some unit tests before ready and I wonder if we can also include a smoke test that claude code/some other application can communicate with the API correctly

@mgoin mgoin self-requested a review August 11, 2025 19:09
@njhill
Member

njhill commented Aug 11, 2025

Thanks @LiuLi1998! Would you also be willing to help with ongoing support/maintenance of the API?

@LiuLi1998
Author

> Exciting! Make sure to add some unit tests before ready and I wonder if we can also include a smoke test that claude code/some other application can communicate with the API correctly

Thanks for the input! I agree — I’ll add some tests soon to make sure everything works as expected.

@LiuLi1998
Author

> Thanks @LiuLi1998! Would you also be willing to help with ongoing support/maintenance of the API?

Definitely! I'd be glad to help with ongoing support and maintenance of the API.

@LiuLi1998
Author

> Exciting! Make sure to add some unit tests before ready and I wonder if we can also include a smoke test that claude code/some other application can communicate with the API correctly

I've added initial tests. I'm not entirely sure the current approach follows best practices or covers everything needed, so I'd really appreciate feedback on improvements or any other cases to cover.

Signed-off-by: liuli <[email protected]>
@mgoin mgoin changed the title from "Support Anthropic API Endponit" to "Support Anthropic API /v1/messages Endpoint" Aug 13, 2025